45 research outputs found

Speaker characterization using adult and children’s speech

Author: Safavi Saeid
Publication date: 01/07/2015

Speech signals contain important information about a speaker, such as age, gender, language, accent, and emotional/psychological state. Automatic recognition of these characteristics has a wide range of commercial, medical and forensic applications, such as interactive voice response systems, service customization, natural human-machine interaction, recognizing the type of pathology of speakers, and directing the forensic investigation process. Many such applications depend on reliable systems that use short speech segments without regard to the spoken text (text-independent). All of these applications also extend to children’s speech. This research aims to develop accurate methods and tools to identify different characteristics of speakers. Our experiments cover speaker recognition, gender recognition, age-group classification, and accent identification. However, similar approaches and techniques can be applied to identify other characteristics such as emotional/psychological state. The main focus of this research is on detecting these characteristics from children’s speech, which has previously been reported to be more challenging than adults’ speech. Furthermore, the impact of different frequency bands on the performance of several recognition systems is studied, and the performance obtained using children’s speech is compared with the corresponding results from experiments using adults’ speech. Speaker characterization is performed by fitting a probability density function to acoustic features extracted from the speech signals. Since the distribution of acoustic features is complex, Gaussian mixture models (GMM) are applied. Because data are limited, parametric model adaptation methods have been applied to adapt the universal background model (UBM) to the characteristics of utterances. An effective approach involves adapting the UBM to speech signals using the Maximum-A-Posteriori (MAP) scheme. Then, the Gaussian means of the adapted GMM are concatenated to form a Gaussian mean super-vector for a given utterance. Finally, a classification or regression algorithm is used to identify the speaker characteristics. While effective, Gaussian mean super-vectors are high-dimensional, which results in high computational cost and difficulty in obtaining a robust model when data are limited. In the field of speaker recognition, recent advances using the i-vector framework have increased the classification accuracy. This framework, which provides a compact representation of an utterance in the form of a low dimensional feature vector, applies a simple factor analysis on GMM means
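The GMM-UBM steps named in the abstract above (MAP adaptation of the UBM, then concatenation of the adapted means into a super-vector) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the diagonal-covariance UBM, the relevance factor of 16, the means-only adaptation rule, and the function names `map_adapt_means` and `mean_supervector` are all assumptions chosen for illustration, following the commonly used relevance-MAP formulation.

```python
import numpy as np

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance UBM to one utterance.

    frames: (T, D) acoustic features; weights: (K,); means, variances: (K, D).
    relevance is an assumed hyperparameter, not a value from the source.
    """
    # Log-density of every frame under every Gaussian component: shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    # Component posteriors (responsibilities), normalised in the log domain.
    log_post = np.log(weights) + log_gauss
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    post = np.exp(log_post)

    n_k = post.sum(axis=0)                                   # soft counts, (K,)
    data_mean = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]               # adaptation weight
    # Components with little data stay close to the UBM means.
    return alpha * data_mean + (1.0 - alpha) * means

def mean_supervector(adapted_means):
    """Concatenate the K adapted mean vectors into one (K * D,) vector."""
    return adapted_means.reshape(-1)

# Toy usage: a 2-component UBM in 2 dimensions and synthetic frames.
rng = np.random.default_rng(0)
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [5.0, 5.0]])
ubm_vars = np.ones((2, 2))
utterance = rng.normal(loc=1.0, scale=0.1, size=(200, 2))

adapted = map_adapt_means(utterance, ubm_weights, ubm_means, ubm_vars)
supervector = mean_supervector(adapted)
```

In this toy case the first component, which explains almost all frames, moves toward the data, while the second stays essentially at its UBM value. The resulting super-vector has dimension K × D, which is the high dimensionality the abstract identifies as a drawback.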

University of Birmingham Research Archive, E-theses Repository

Automatic speaker, age-group and gender identification from children's speech

Author: Jancovic Peter
Russell Martin
Safavi Saeid
Publication venue: 'Elsevier BV'
Publication date: 01/07/2018

Crossref

University of Birmingham Research Portal

Acoustic model selection using limited data for accent robust speech recognition

Author: Hanani Abualsoud
Najafian Maryam
Russell Martin
Safavi Saeid
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

This paper investigates techniques to compensate for the effects of regional accents of British English on automatic speech recognition (ASR) performance. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation, or to use accent identification (AID) to identify the speaker’s accent followed by accent-dependent ASR? Three approaches to accent-dependent modelling are investigated: using the ‘correct’ accent model, choosing a model using supervised (ACCDIST-based) accent identification (AID), and building a model using data from neighbouring speakers in ‘AID space’. All of the methods outperform the accent-independent model, with relative reductions in ASR error rate of up to 44%. Using on average 43 s of speech to identify an appropriate accent-dependent model outperforms using it for supervised speaker adaptation, by 7%

University of Birmingham Research Portal

FADA - Birzeit University

Identification of Age-Group from Children's Speech by Computers and Humans

Author: Jancovic Peter
Russell Martin
Safavi Saeid
Publication date: 01/09/2014

University of Birmingham Research Portal

Effect of erythropoietin on Glasgow Coma Scale and Glasgow Outcome Scale in patients with diffuse axonal injury

Author: Azim Honarmand
Mohammadreza Safavi
Saeid Abrishamkar
Publication venue: Wolters Kluwer Medknow Publications
Publication date: 01/01/2012

Background: Erythropoietin (EPO), a major stimulator of red blood cell (RBC) production, plays a key role in brain protection and has a protective effect on neurons against hypoxic or traumatic injury. The objective of this trial was to study the safety and efficacy of recombinant human EPO (rhEPO) on level of consciousness and other outcomes in patients with post-traumatic diffuse axonal injury (PTDAI). Methods: In a controlled double-blind randomized clinical trial, 54 patients aged 20-47 years were randomly allocated to 2 groups. Subjects in the intervention group (n = 27) received 2000 U open-label rhEPO (Erythropoietin-β; Roche, Grenzach-Wyhlen, Germany) subcutaneously for six doses in two weeks (on days: 2, 4, 6, 8 and 10). The efficacy of the intervention was evaluated by GCS (Glasgow Coma Scale) and GOS (Glasgow Outcome Scale). Results: The patients treated with rhEPO improved earlier, with the difference between the treatment groups occurring on day 10 (score differences of 9.6 for GCS and 1.9 for GOS). The better course of the rhEPO-treated patients continued throughout the remaining study period. The hematocrit and red blood cell counts did not increase to levels exceeding the normal range in rhEPO patients. Conclusions: Intravenous EPO was well tolerated in diffuse axonal injury and was associated with an improvement in patients' outcome in 2 weeks

Directory of Open Access Journals